Beyond Similarity
The "80% problem" arises when basic semantic search handles simple queries well but fails at the edges. When you search on similarity alone, a vector database returns the numerically closest chunks. If those chunks are near-duplicates, however, the LLM receives repeated information, wasting limited context space and missing the broader picture.
Pillars of Advanced Retrieval
- Maximal Marginal Relevance (MMR): rather than selecting only the most similar items, MMR balances relevance against diversity to avoid redundancy. $$MMR = \text{argmax}_{d \in R \setminus S} [\lambda \cdot \text{sim}(d, q) - (1 - \lambda) \cdot \max_{s \in S} \text{sim}(d, s)]$$
- Self-Query: uses an LLM to translate natural language into structured metadata filters (e.g., filtering by "Lecture 3" or "source: PDF").
- Contextual Compression: shrinks retrieved documents by extracting only the "high-nutrient" snippets relevant to the query, saving tokens.
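The MMR formula above can be sketched as a small re-ranking loop. This is a toy illustration assuming similarity scores have already been computed; the function name, inputs, and data layout are invented for this example, not taken from any library.

```python
# Toy MMR re-ranking over precomputed similarity scores.
# query_sims[i] = sim(d_i, q); doc_sims[i][j] = sim(d_i, d_j).

def mmr_select(query_sims, doc_sims, k, lam=0.5):
    """Pick k documents, balancing relevance (lam) against diversity (1 - lam)."""
    candidates = list(range(len(query_sims)))
    selected = []
    while candidates and len(selected) < k:
        def mmr_score(i):
            # Redundancy = similarity to the closest already-selected document.
            redundancy = max((doc_sims[i][j] for j in selected), default=0.0)
            return lam * query_sims[i] - (1 - lam) * redundancy
        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With lam=1.0 this degenerates to plain similarity ranking; lowering lam penalizes picking a document that closely resembles one already chosen, which is exactly how MMR avoids returning three copies of the same paragraph.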
The Redundancy Trap
Giving an LLM three versions of the same paragraph does not make it smarter; it only makes the prompt more expensive. Diversity is the key to building "high-nutrient" context.
Knowledge Check
You want your system to answer "What did the instructor say about probability in the third lecture?" specifically. Which tool allows the LLM to automatically apply a filter for { "source": "lecture3.pdf" }?

Challenge: The Token Limit Dilemma
Apply advanced retrieval strategies to solve a real-world constraint.
You are building a RAG system for a legal firm. The documents retrieved are 50 pages long, but only 2 sentences per page are actually relevant to the user's specific query. The standard "Stuff" chain is throwing an OutOfTokens error because the context window is overflowing with irrelevant text.
Step 1
Identify the core problem and select the appropriate advanced retrieval tool to solve it without losing specific nuances.
Problem: The context window limit is being exceeded by "low-nutrient" text surrounding the relevant facts.
Tool Selection: ContextualCompressionRetriever

Step 2
What specific component must you use in conjunction with this retriever to "squeeze" the documents?
Solution: Use an LLMChainExtractor as the base for your compressor. This will process the retrieved documents and extract only the snippets relevant to the query, passing a much smaller, highly concentrated context to the final prompt.
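To make the "squeeze" concrete without depending on an LLM, here is a minimal stand-in for what an extractor does: keep only sentences that overlap with the query and discard the rest. This keyword-overlap heuristic is an illustrative assumption, far cruder than LLMChainExtractor's LLM-based extraction; every name in it is invented for this sketch.

```python
import re

def compress_docs(docs, query, min_overlap=1):
    """Keep only sentences sharing at least min_overlap words with the query.

    A crude, keyword-based stand-in for LLM-driven contextual compression:
    the retrieved pages shrink to the few sentences that mention query terms.
    """
    query_words = set(re.findall(r"\w+", query.lower()))
    kept = []
    for doc in docs:
        # Naive sentence split on terminal punctuation followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", doc):
            words = set(re.findall(r"\w+", sentence.lower()))
            if len(words & query_words) >= min_overlap:
                kept.append(sentence.strip())
    return kept
```

In the legal-firm scenario, 50 pages of retrieved text would collapse to the handful of sentences that actually mention the query's terms, so the final prompt fits comfortably inside the context window.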